Abstract

We consider two theorems from the theory of compressive sensing. Mainly a
theorem concerning uniform recovery of random sampling matrices, where the
number of samples needed in order to recover an $s$-sparse signal from linear
measurements (with high probability) is known to be
$m \gtrsim s(\ln s)^3 \ln N$. We present new and improved constants together
with what we consider to be a more explicit proof, one that also allows for
a slightly larger class of $m \times N$-matrices, by considering what we call
low entropy. We also present an improved condition on the so-called restricted
isometry constants, $\delta_s$, ensuring sparse recovery via
$\ell^1$-minimization. We show that $\delta_{2s} < 4/\sqrt{41}$ is sufficient
and that this can be improved further to almost allow for a sufficient
condition of the type $\delta_{2s} < 2/3$.

Keywords: compressive sensing, $\ell^1$-minimization, random sampling matrices,
bounded orthogonal systems, restricted isometry property

1 Introduction

The theory of compressive sensing has emerged over the last 6-8 years, with the
results we will consider originally presented by Tao, Candès et al. in [5] and
[4]. Rudelson and Vershynin improved the results in [15] and further
generalizations were made by Rauhut in [14], which also offers a nice overview
of the topic. Today there is a vast literature on the topic, of which the
authors would also like to mention [7] and [8]. Since the literature spans a
wide range of results, we do not aim to give a rigorous overview here but
instead refer to the mentioned papers, from which we have gathered a lot of
inspiration and where many further references can be found.

The beginning of section 3 provides only a brief introduction to the topic with
concepts that should be familiar to those that have encountered compressive
sensing before. At the end of the section we present an improved version of a
theorem from [13], regarding when the restricted isometry property implies the
null space property.

In section 4 the most important inequalities and lemmas, to be used in the
proof of the main results of section 5, are presented. This section could
possibly be skipped by readers familiar with the topic.

Our main concern will be the theorem of uniform recovery for random sampling
matrices. To our knowledge the best result known is due to Cheraghchi,
Guruswami and Velingker in [12]. The theorem is stated to hold for the special
case of a discrete Fourier matrix, but the authors remark that it also goes
through for bounded orthonormal matrices. The result is the best in terms of
asymptotics, and we will re-use a lot of their arguments but also provide
constants that are improved compared with earlier results that we have
encountered. We feel that our proof is more explicit in some ways, which we
hope can offer more understanding of the techniques. First, in section 2 we go
into more detail about the differences and similarities of our work compared to
the other mentioned ones.



2 Comparisons with previous results 



In [12], the following version of theorem 5.2 is proved (using our notations and
terminology):

Theorem 2.1 ([12], Theorem 19). Let $A \in \mathbb{C}^{m \times N}$ be an
orthonormal matrix with entries bounded by $O(1/\sqrt{N})$. Then for every
$\delta, \epsilon > 0$ and $N > N_0(\delta, \epsilon)$, with probability at
least $1 - \epsilon$ the restricted isometry constants $\delta_s$ of
$\sqrt{N/m}\,A$ are less than $\delta$ for some $m$ satisfying

$$m \lesssim \frac{\ln(1/\epsilon)}{\delta^2}\, s (\ln s)^3 \ln N.$$

Here $f \lesssim g$ means that there exists a constant $C > 0$ such that
$f \le Cg$. In comparison we have achieved

$$m \gtrsim \frac{s}{\delta^2}\left((\ln s)^3 \ln N + \ln\frac{1}{\epsilon}\right). \tag{1}$$



In the sense that theorem 2.1 is summarized in their paper, namely that the
number of samples needed is of order $s(\ln s)^3 \ln N$, we have not made any
contribution (i.e. with regards to the asymptotics). However we think that for
small $\epsilon$ the improvement is not insignificant. We also allow for a
larger class of matrices and provide explicit constants. When constants have
been presented before (for actually worse results in terms of asymptotics), as
far as we have seen they have been about a factor 10 larger than ours.

The main differences in the proofs lie in the arguments surrounding Dudley's
inequality for Rademacher processes and in that we do not make use of two
different covering number estimates. The inequality requires a quite heavy
proof, using probabilistic methods, cf. [11]. We re-use some of the arguments
in that proof, but we first do pointwise estimates and then simply replace
supremums with sums. One must take care when doing the covering and counting,
details that we hope are a bit more clear through our exposition.



3 Preliminaries 

We denote by $\|\cdot\|_p$, $1 \le p \le \infty$, the usual $\ell^p$ norm for
vectors; $\|z\|_0 := |\operatorname{supp} z|$ denotes the cardinality of the
support of a vector $z$ (sometimes called "0-norm", despite not being a norm)
and $[N] = \{1, 2, \dots, N\}$. In this work we will mostly restrict ourselves
to vectors with real entries but one could easily generalize the results to
complex vectors. By $E_X$ we denote the expectation value with respect to a
random variable, or random vector, $X$. In particular for the random sampling
matrices with rows $X = \{X_j\}_{j=1}^m$ we will use $E$ to mean
$E_X = E_{X_1} E_{X_2} \cdots E_{X_m}$ and otherwise be clear with subscripts if
the expectation is taken in another random variable. Given a random variable
$X$ and a measurable function $f$, we can for $1 \le p < \infty$ induce the
$L^p$-norms

$$\|f\|_{X,p} := E_X\left[|f(X)|^p\right]^{1/p}.$$

3.1 Sparsity and Restricted Isometry

We start by defining what we mean by a sparse vector. In what follows, $N$
denotes a (usually large) positive integer.

Definition 3.1. $x \in \mathbb{C}^N$ is called $s$-sparse if $\|x\|_0 \le s$.

The next definition will be of great use throughout this paper.

Definition 3.2. If $x = (x_1, \dots, x_N)$, $S \subset [N]$, we define
$x_S = ((x_S)_1, \dots, (x_S)_N)$ by $(x_S)_k = x_k \chi_S(k)$, where

$$\chi_S(k) = \begin{cases} 1, & \text{if } k \in S \\ 0, & \text{otherwise} \end{cases}$$

is the characteristic function of the set $S$. Clearly $x = x_S + x_{S^c}$,
where $S^c = [N] \setminus S$.

In practice one rather accepts a small "$s$-term approximation error", i.e. one
wants the following quantity to be small:

$$\sigma_s(x)_p := \inf\{\|x - z\|_p,\ z \text{ is } s\text{-sparse}\}.$$

Think of $y \in \mathbb{C}^m$ as the measured quantity from a measurement of
$x \in \mathbb{C}^N$, modelled after $y = Ax$, where
$A \in \mathbb{C}^{m \times N}$ is an $m \times N$-matrix and we assume that
$m \ll N$. In general this system is impossible to solve, unless we impose the
extra condition that $x$ is $s$-sparse and consider

$$\min \|z\|_0 \quad \text{subject to } Az = y, \tag{2}$$

in the hope that its solution $x^* = x$. This is still very hard to solve in
general so one would like to consider the closest convex relaxation of (2),
which is

$$\min \|z\|_1 \quad \text{subject to } Az = y. \tag{3}$$

We ask when the solution of (3) is equivalent to the solution of (2). The key
notion is the so-called null space property for a matrix.

Definition 3.3. A matrix $A \in \mathbb{C}^{m \times N}$ satisfies the null
space property of order $s$ if for all subsets $S \subset [N]$ with $|S| = s$ it
holds that

$$\|v_S\|_1 < \|v_{S^c}\|_1 \quad \text{for all } v \in \ker A \setminus \{0\}. \tag{4}$$

We write $A \in NSP(s)$.

The following theorem gives the answer to when a solution of (2) equals the
solution of (3); for the proof see for example [14] (Theorem 2.3, p. 8) or [8].

Theorem 3.4. Let $A \in \mathbb{C}^{m \times N}$. Then every $s$-sparse vector
$x \in \mathbb{C}^N$ is the unique solution to the $\ell^1$-minimization problem
(3) with $y = Ax$ if and only if $A$ satisfies the null space property of order
$s$.

Below we present a helpful proposition that can be used to verify the null space
property. The proof is a simple consequence of Lemma 6.3 in the appendix, where
we sketch out the details. With a slightly more involved proof the proposition
could be improved a bit further, replacing the constant $4/5$ with a constant
arbitrarily close to (for large $s$) $\sqrt{4/5}$. See further section 6.2.

Proposition 3.5. Assume $x = (x_1, \dots, x_N) \in \mathbb{C}^N$ is such that
$|x_1| \ge |x_2| \ge \dots \ge |x_N|$. Write $x = \sum_k x_{S_k}$ where
$S_1 = \{1, \dots, s\}$, $S_2 = \{s+1, \dots, 2s\}$ etc., so that $|S_k| = s$
(except for possibly the last $k$). Denote by $S^c = [N] \setminus S$. Then if

$$\|x_{S_1}\|_2 < \frac{4}{5} \sum_{k>1} \|x_{S_k}\|_2,$$

it holds that $\|x_S\|_1 < \|x_{S^c}\|_1$ for all subsets $S \subset [N]$ with
$|S| = s$.

Unfortunately, the null space property is often hard to verify. Instead one 
usually tries to verify the weaker restricted isometry property for a matrix. 

Definition 3.6. The restricted isometry constant $\delta_s$ of a matrix
$A \in \mathbb{C}^{m \times N}$ is defined as the smallest $\delta_s$ such that

$$(1 - \delta_s)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_s)\|x\|_2^2 \tag{5}$$

for all $s$-sparse $x \in \mathbb{C}^N$. We abbreviate this by
$A \in RIP(\delta_s)$.

Another characterization of the restricted isometry constants is given by:

Proposition 3.7 ([14]: 2.5 (p. 9)). Let $A \in \mathbb{C}^{m \times N}$, with
restricted isometry constants $\delta_s$, then

$$\delta_s = \sup_{x \in T_s} |\langle (A^*A - I)x, x \rangle|, \quad \text{where } T_s = \{x \in \mathbb{C}^N,\ \|x\|_2 = 1,\ \|x\|_0 \le s\}.$$

The restricted isometry property can, under some extra condition, imply the 
null space property as the following theorem suggests. 

Theorem 3.8. Suppose the restricted isometry constant $\delta_{2s}$ of a matrix
$A \in \mathbb{C}^{m \times N}$ satisfies

$$\delta_{2s} < \frac{4}{\sqrt{41}},$$

then the null space property of order $s$ is satisfied. In particular, every
$s$-sparse vector $x \in \mathbb{C}^N$ is recovered by $\ell^1$-minimization.

This is an improvement of the best known result, from [13], which had
$\delta_{2s} < 0.4931$ (see also [14], [2], [1]). The proof will be included in
the appendix. With some more work the authors can replace the constant
$4/\sqrt{41}$ with a constant that, for large $s$, is arbitrarily close to
$2/3$. The key ingredient is the mentioned improvement of proposition 3.5. See
further section 6.2. The best we can hope for is to replace the constant with
$1/\sqrt{2}$, due to the work in [5].



3.2 Entropy and Low Entropy Isometry 

Next we will define the $\ell^1$-entropy (also known as the $\ell^1$-sparsity
level, as defined in for example [16]), which is closely related to sparseness.

Definition 3.9. By the $\ell^1$-entropy of a nonzero vector
$x \in \mathbb{R}^N$ we mean the quantity

$$\operatorname{Ent}(x) = \frac{\|x\|_1^2}{\|x\|_2^2}.$$
Remark 3.10. Clearly if x is s-sparse then Ent(x) < s by Cauchy-Schwarz 
inequality. 
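
As a small illustration of Remark 3.10, the following sketch computes the $\ell^1$-entropy of a vector and compares it with the number of nonzero entries. It assumes that the entropy is the ratio $\|x\|_1^2 / \|x\|_2^2$ (the usual $\ell^1$-sparsity level); the function names `l1_entropy` and `sparsity` are hypothetical and chosen only for this example.

```python
def l1_entropy(x):
    """Return ||x||_1^2 / ||x||_2^2 for a nonzero vector x (assumed form of Ent)."""
    l1 = sum(abs(v) for v in x)
    l2_squared = sum(abs(v) ** 2 for v in x)
    if l2_squared == 0:
        raise ValueError("the entropy is only defined for nonzero vectors")
    return l1 ** 2 / l2_squared


def sparsity(x):
    """Number of nonzero entries, i.e. the '0-norm' ||x||_0."""
    return sum(1 for v in x if v != 0)


# A flat 3-sparse vector attains Ent(x) = s, a peaked one has much lower entropy.
flat = [1.0, -1.0, 1.0, 0.0, 0.0]
peaked = [10.0, 0.1, 0.1, 0.0, 0.0]
print(l1_entropy(flat), sparsity(flat))
print(l1_entropy(peaked), sparsity(peaked))
```

The second vector shows why low entropy is a weaker requirement than sparsity: a vector with a few dominant entries has small entropy even when its support is not small.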

In place of the null space property, one has the null entropy property.

Definition 3.11. A matrix $A \in \mathbb{C}^{m \times N}$ satisfies the null
entropy property of order $t$ if for every $x \in \ker A \setminus \{0\}$ it
holds that $\operatorname{Ent}(x) > t$. We write $A \in NEP(t)$.

A low entropy isometry property can be defined as well, analogous to the
restricted isometry property.

Definition 3.12. A matrix $A \in \mathbb{C}^{m \times N}$ satisfies the low
entropy isometry property with constants $\delta_t$ if for all $x$ with
$\operatorname{Ent}(x) \le t$,

$$\left| \|Ax\|_2^2 - \|x\|_2^2 \right| \le \delta_t \|x\|_2^2.$$

We abbreviate this by $A \in LEIP(\delta_t)$.

Many of the above notions are related by the following proposition:

Proposition 3.13.

1. If $t \ge 4s$ and $A \in NEP(t)$ then $A \in NSP(s)$.

2. If $\delta_t < 1$ and $A \in LEIP(\delta_t)$, then $A \in NEP(t)$.

3. If $s \le t$ and $A \in LEIP(\delta_t)$, then $A \in RIP(\delta_s)$ for some
$\delta_s \le \delta_t$.

A variant of 1 can be found in [16], and both that and 2 can be proved on
a single line by considering the contrapositive statements, while 3 is obvious.

3.3 Bounded orthonormal systems 

Let $\mathcal{D} \subset \mathbb{R}^d$, $\nu$ a probability measure on
$\mathcal{D}$, $\{\psi_j\}_{j=1}^N$ a bounded orthonormal system of
complex-valued functions on $\mathcal{D}$. This means that for $j, k \in [N]$,

$$\int_{\mathcal{D}} \psi_j(t) \overline{\psi_k(t)}\, d\nu(t) = \delta_{jk}, \tag{6}$$

and $\{\psi_j\}$ is uniformly bounded in $L^\infty$,

$$\|\psi_j\|_\infty = \sup_{t \in \mathcal{D}} |\psi_j(t)| \le K \quad \text{for all } j \in [N],\ (K \ge 1). \tag{7}$$

Let now $t_1, \dots, t_m \in \mathcal{D}$ (picked independently at random with
respect to $\nu$) and suppose we are given sample values

$$y_l = f(t_l) = \sum_{k=1}^N x_k \psi_k(t_l), \quad l = 1, \dots, m.$$

Introduce $A \in \mathbb{C}^{m \times N}$, $A = (a_{lk})$,
$a_{lk} = \psi_k(t_l)$, $l = 1, \dots, m$; $k = 1, \dots, N$. Then $y = Ax$,
$y = (y_1, \dots, y_m)^T$ and $x$ is a vector of coefficients. We wish to
reconstruct the polynomial $f$ (or equivalently $x$) from the samples $y$, using
as few samples as possible. If we assume that $f$ is $s$-sparse (defined to be
so if $x$ is $s$-sparse) the problem reduces to solving $y = Ax$ with a sparsity
constraint. $P(t_l \in B) = \nu(B)$ for measurable $B \subset \mathcal{D}$, so
$A$ becomes a random sampling matrix (it fulfills (6), (7) and the $t_l$ are
picked independently at random with respect to $\nu$). One interesting example
is given by sampling $m$ rows from the $N \times N$-matrix

$$a_{lk} = \frac{e^{2\pi i lk/N}}{\sqrt{N}}, \quad l, k \in [N].$$

This matrix is called a random partial Fourier matrix. We summarize this
section with a definition of the matrices we will continue to study.

Definition 3.14 (Random Sampling Matrix). A matrix
$A \in \mathbb{C}^{m \times N}$ is said to be a random sampling matrix if its
rows $X = \{X_j\}_{j=1}^m$ fulfill the conditions:

1. $\|X_j\|_\infty \le K$ for some $K \ge 1$.

2. $E[X_j^* X_j] = I_N$ ($N \times N$ identity matrix), for all $j$.
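
The two conditions of Definition 3.14 can be checked directly for the Fourier example. The sketch below is only an illustration under stated assumptions: it uses the rows of the discrete Fourier matrix rescaled by $\sqrt{N}$ (so that every entry has modulus $K = 1$), draws the row index uniformly at random, and computes the expectation $E[X^*X]$ exactly by averaging over all $N$ possible rows. The helper names are hypothetical.

```python
import cmath
import random


def dft_row(l, N):
    """Row l of the N x N Fourier matrix scaled so that all entries have modulus 1."""
    return [cmath.exp(2j * cmath.pi * l * k / N) for k in range(N)]


def sample_rows(m, N, rng):
    """Draw m rows independently, each with a uniformly distributed row index."""
    return [dft_row(rng.randrange(N), N) for _ in range(m)]


def expected_gram(N):
    """Exact E[X^* X] for a uniformly chosen row X, as an N x N list of lists."""
    E = [[0j] * N for _ in range(N)]
    for l in range(N):
        row = dft_row(l, N)
        for a in range(N):
            for b in range(N):
                E[a][b] += row[a].conjugate() * row[b] / N
    return E


A = sample_rows(3, 4, random.Random(0))   # a 3 x 4 random sampling matrix
E = expected_gram(4)                      # should be the 4 x 4 identity
```

Orthogonality of the characters $k \mapsto e^{2\pi i lk/N}$ is what makes the averaged outer products collapse to the identity; the same computation works for any bounded orthonormal system once the sampling measure $\nu$ is taken into account.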
4 Preparatory lemmas and inequalities

We move on to present some key ingredients to be used in the proof of the main
theorem of this paper. First we recall the definition of a Rademacher
sequence.

Definition 4.1. A Rademacher sequence $\epsilon = (\epsilon_j)_{j=1}^m$ is a
random vector whose components $\epsilon_j$ take the values $\pm 1$ with equal
probability ($= \frac{1}{2}$).

Symmetrization is a useful technique that will later be used to bound the
expectation value of the restricted isometry constants $\delta_s$. The proof of
the proposition is not very hard and can be found in for example [10] or [14].

Proposition 4.2 (Symmetrization). Assume that $\xi = (\xi_j)_{j=1}^m$ is a
sequence of independent random vectors in $\mathbb{C}^N$ equipped with a
(semi-)norm $\|\cdot\|$, having expectations $x_j = E\,\xi_j$. Then for
$1 \le p < \infty$

$$\left( E \Big\| \sum_{j=1}^m (\xi_j - x_j) \Big\|^p \right)^{1/p} \le 2 \left( E \Big\| \sum_{j=1}^m \epsilon_j \xi_j \Big\|^p \right)^{1/p}$$

where $\epsilon = (\epsilon_j)_{j=1}^m$ is a Rademacher sequence independent of
$\xi$.

Khintchine's inequality is another important inequality to be used later on.

Proposition 4.3 (Khintchine's inequality). Suppose
$x = (x_1, \dots, x_N) \in \mathbb{C}^N$ and
$\epsilon = (\epsilon_1, \dots, \epsilon_N)$ is a vector whose components are
independent Rademacher random variables, then for $p \ge 2$

$$E_\epsilon \Big| \sum_{j=1}^N \epsilon_j x_j \Big|^p \le 2^{3/4} \left( \frac{p}{e} \right)^{p/2} \|x\|_2^p. \tag{8}$$

The proof can be found in a lot of literature, see for example [14], p. 35.
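
For short vectors the left-hand side of Khintchine's inequality can be computed exactly by enumerating all sign patterns. The following sketch does this and compares with a bound of the form $2^{3/4}(p/e)^{p/2}\|x\|_2^p$, which is the form assumed here for (8); the function names are hypothetical and the code is meant as a sanity check, not as part of the proof.

```python
import itertools
import math


def rademacher_moment(x, p):
    """Exact E|sum_j eps_j x_j|^p, averaging over all 2^N sign patterns."""
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=len(x)):
        total += abs(sum(s * v for s, v in zip(signs, x))) ** p
    return total / 2 ** len(x)


def khintchine_bound(x, p):
    """Assumed right-hand side of (8): 2^(3/4) (p/e)^(p/2) ||x||_2^p."""
    l2 = math.sqrt(sum(abs(v) ** 2 for v in x))
    return 2 ** 0.75 * (p / math.e) ** (p / 2) * l2 ** p


x = [3.0, -1.0, 2.0, 0.5]
for p in (2, 4, 6):
    print(p, rademacher_moment(x, p), khintchine_bound(x, p))
```

For $p = 2$ the moment equals $\|x\|_2^2$ exactly, so the comparison also shows how much room the constant $2^{3/4}(2/e)$ leaves in the simplest case.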



4.1 Covering and packing estimates 

We will work in the framework of a random sampling matrix (with rows
$X = \{X_j\}_{j=1}^m$, $\|X_j\|_\infty \le K$) and introduce the metric

$$d_{X,p}(x, y) := \left( \frac{1}{m} \sum_{j=1}^m |\langle X_j, x - y \rangle|^p \right)^{1/p}.$$

$B_{X,p}(x, r) = \{y \in \mathbb{R}^N : d_{X,p}(x, y) < r\}$ denotes the ball of
radius $r > 0$ around $x \in \mathbb{R}^N$ with respect to the metric
$d_{X,p}$. The next lemma is based on the method of Maurey.

Lemma 4.4 (Covering lemma 1). Let $0 < r < K$, $p \ge 1$,

$$M \ge 2^{3/(4p)}\, \frac{8pK^2}{e r^2} \tag{9}$$

and let $G_M = \{z_j\}$ be the set of grid points in the $\ell^1$ unit cube with
mesh size $\frac{1}{M}$, i.e. the set of points satisfying $\|z\|_1 \le 1$ and
$Mz \in \mathbb{Z}^N$. Then $B_1 = \{z \in \mathbb{R}^N;\ \|z\|_1 \le 1\}$ is
contained in $\cup_j B_{X,2p}(z_j, r)$ for a fixed realization of
$X = \{X_j\}$ with the property $\|X_j\|_\infty \le K$ and $r$ given by equality
in (9). The number of grid points is less than

$$\binom{2N + M}{M} \le \left( \frac{2Ne}{M} + e \right)^M.$$

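Proof of lemma 4.4. Fix a point $x = (x_1, \dots, x_N) \in B_1$ and define a
random vector $Z = (z_1, \dots, z_N)$ by letting it take the value
$\operatorname{sgn}(x_j) e_j$ with probability $|x_j|$, $j = 1, \dots, N$, and
$Z = 0$ with probability $1 - \|x\|_1$ (so $\|Z\|_0 \le 1$). Let now $Z_k$,
$k = 1, \dots, M$ be $M$ independent copies of $Z$ and define

$$z = \frac{1}{M} \sum_{k=1}^M Z_k.$$

Then $z \in G_M$ and $E_Z z = x$. Now it is enough to prove that

$$E_Z \frac{1}{m} \sum_{j=1}^m |\langle X_j, z - x \rangle|^{2p} \le r^{2p}$$

for some $p \ge 1$. By symmetrization and Khintchine's inequality applied to
every term,

$$\frac{1}{m} \sum_{j=1}^m E_Z |\langle X_j, z - x \rangle|^{2p} \le \frac{1}{m} \sum_{j=1}^m 2^{2p}\, E_Z E_\epsilon \Big| \frac{1}{M} \sum_{k=1}^M \epsilon_k \langle X_j, Z_k \rangle \Big|^{2p}$$

$$\le \frac{1}{m} \sum_{j=1}^m 2^{2p}\, 2^{3/4} \left( \frac{2p}{e} \right)^p \frac{1}{M^{2p}}\, E_Z \Big( \sum_{k=1}^M |\langle X_j, Z_k \rangle|^2 \Big)^p \le 2^{3/4} \left( \frac{8p}{eM} \right)^p K^{2p} \le r^{2p}.$$

The number of balls needed for the cover follows from simple combinatorics.
We can choose $M$ vectors out of the collection
$\{\pm e_j\}_{j=1}^N \cup \{0\}$ in less than $\binom{2N + M}{M}$ ways (i.e. we
count the number of unordered selections with repetition allowed). It is also
well-known that

$$\binom{2N + M}{M} \le \left( \frac{2Ne}{M} + e \right)^M.$$

□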
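
The empirical construction in the proof can be simulated in a few lines. The sketch below assumes the sampling rule described above (the value $\operatorname{sgn}(x_j)e_j$ with probability $|x_j|$, and $0$ otherwise) and averages $M$ independent draws, so that the result lies on the grid with mesh size $1/M$ inside the $\ell^1$ unit ball. It also compares the multiset count $\binom{2N+M}{M}$ with the elementary bound $(e(2N+M)/M)^M$, which is used here as a stand-in for the estimate in the lemma. All names are hypothetical.

```python
import math
import random


def maurey_sample(x, M, rng):
    """Average of M i.i.d. draws of Z, where Z = sgn(x_j) e_j w.p. |x_j|, else Z = 0."""
    counts = [0] * len(x)
    for _ in range(M):
        u = rng.random()
        acc = 0.0
        for j, v in enumerate(x):
            acc += abs(v)
            if u < acc:
                counts[j] += 1 if v > 0 else -1
                break
        # if no coordinate was hit, the draw is Z = 0 (probability 1 - ||x||_1)
    return [c / M for c in counts]


def grid_size_bound(N, M):
    """Exact multiset count and the elementary bound (e(2N + M)/M)^M."""
    return math.comb(2 * N + M, M), (math.e * (2 * N + M) / M) ** M


x = [0.5, -0.3, 0.0, 0.1]                 # a point with ||x||_1 <= 1
z = maurey_sample(x, 16, random.Random(1))
exact, bound = grid_size_bound(len(x), 16)
```

Since $E_Z z = x$, repeating the draw many times and averaging recovers $x$; the proof only needs that some realization of $z$ is within distance $r$ of $x$ in the metric $d_{X,2p}$.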
Remark 4.5. We will use Lemma 4.4 for $z \in B_1(0, \sqrt{s})$, $M = 2^{2k}$, so
the radii of the balls in the cover will then be

$$r_k(p) = 2^{-k} K \left( \frac{2^{3/(4p)}\, 8sp}{e} \right)^{1/2}$$

and the number of balls in the cover (the covering number) for this $k$ will be
at most

$$N_k = \left( \frac{2Ne}{2^{2k}} + e \right)^{2^{2k}}.$$


5 Uniform recovery theorem 

The following technical lemma is going to be the key ingredient and we postpone 
the rather involved proof until the end of this section. 

Lemma 5.1. Let $A \in \mathbb{C}^{m \times N}$ be a random sampling matrix with
corresponding low entropy isometry (or restricted isometry) constants
$\delta_s$ and rows $\{X_j\}_{j=1}^m$ having the properties that
$\|X_j\|_\infty \le K$ for some $K \ge 1$ and $E[X_j^* X_j] = I_N$ for
all j. Suppose that N > Ap,p = ln(2 3 / 4 if 2 s) > 2,0 < A, g < 1, then 

(E5 2n )^ < (H + Xg)((ES 2n )^ +1)^ 
where q — q(K. s) G (1, 2] and 

H = H(N,K,m,s,X,g) = 

2 w K 2 s\ 1/2 ( ,,„, /2 6 c^ 2 s\ ,/„ \ eln2 



(ln2) 2 m 



P 1/2 In ( 2 ^) ln 1/2 (N/ P ) + ln^l/o) , a = 



Using lemma 5.1 we can prove 



Theorem 5.2. Let $A \in \mathbb{C}^{m \times N}$ be a random sampling matrix
with corresponding low entropy isometry (or restricted isometry) constants
$\delta_s$ and rows $\{X_j\}_{j=1}^m$ having the properties that
$\|X_j\|_\infty \le K$ for some $K \ge 1$ and $E[X_j^* X_j] = I_N$ for all $j$.
Suppose $0 < \delta, \epsilon, \lambda < 1$ and

$$\sqrt{m} \ge C_1 K \sqrt{s} \left( \ln^{1/2}(2^{3/4} K^2 s)\, \ln(C_2 K^2 s)\, \ln^{1/2}(N) + \ln^{1/2}(1/\epsilon) \right) \tag{10}$$

where 

2«eV4 (V-c + 5)V 2 2^ 2 (S + y-c) 
Cl(M)= In 2 (1-\)S ' C2 ^ A ) = 

Then $P(\delta_s > \delta) \le \epsilon$, that is $\frac{1}{\sqrt{m}} A$ has the
low entropy (or restricted) isometry property with constants
$\delta_s \le \delta$ with probability $1 - \epsilon$.

Proof. Since in our framework $s \le m \ll N$, by Markov's inequality, for any
$n > 0$,

$$P(\delta_s > \delta) = P(\delta_s^{2n} > \delta^{2n}) \le \frac{E\,\delta_s^{2n}}{\delta^{2n}}.$$

By lemma 5.1 this is less than $\epsilon \in (0, 1)$ if

$$H + \lambda g \le \frac{\epsilon^{1/2n}\delta}{(\epsilon^{1/2n}\delta + 1)^{1/2q}}. \tag{11}$$

Choosing $n \ge \ln(1/\epsilon)$ implies that $\epsilon^{1/2n} \ge e^{-1/2}$ and
with this choice (11) is easily seen to be implied by

H + Xg< 6 (12) 
Define the right hand side expression to be $g = g(\delta)$, then

H < (1 - %(£) <=► 
2 10 iT 2 s \ 1/2 / 1/0 , /2 6 eATV 



(ln2) 2 m 



>/ a ln ( 2 ( ^ S ) ln 1/2 W P ) + ln 1/9 (l/e a )) < (1-A) fl 



Since $\alpha < 1$, and by removing some lower order terms, (10) can be seen to
imply (12), so we are done. □

Remark 5.3. We could modify the proof, choosing $n$ larger so that
$\epsilon^{1/2n}$ comes arbitrarily close to 1, compared to above where we only
used $e^{-1/2}$ as lower bound. This corresponds to the constants we would get
by an argument closer to what is done for the best earlier results, where the
so-called deviation inequality is used.

If we introduce

$$C(\delta, \lambda) = \sqrt{2}\, C_1(\delta, \lambda), \quad D(\delta, \lambda) = 2 C_2(\delta, \lambda)$$

we get together with theorem 3.8 the following corollary to theorem 5.2.

Corollary 5.4. Let $A \in \mathbb{C}^{m \times N}$ be a random sampling matrix
with corresponding restricted isometry constants $\delta_s$ and rows
$\{X_j\}_{j=1}^m$ having the properties that $\|X_j\|_\infty \le K$ for some
$K \ge 1$ and $E[X_j^* X_j] = I_N$ for all $j$. Suppose
$0 < \epsilon, \lambda < 1$ and if

^T l>C (-^.A) X^ln 1 / 2 (2 3 / 4 X 2 S )ln (d K 2 s^j \n l ' 2 {N/p)+ 

,A^^ln 1/2 (l/e) (13) 



C 



4 



41 



then with probability $1 - \epsilon$, the matrix $\frac{1}{\sqrt{m}} A$
satisfies the null space property of order $s$.

Remark 5.5. Another variant of the above corollary would be to instead demand
that the low entropy isometry constants $\delta_{4s} < 1$ and use proposition
3.13.

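Below we present tables of values of $C^2$ (for convenience these are easier to
compare with older results) and $D$ for some interesting choices of $\delta$
and $\lambda$.

Table 1: Some values of $C(\delta, \lambda)^2$, $D(\delta, \lambda)$.

$\lambda$       $C(4/\sqrt{41}, \lambda)^2$   $D(4/\sqrt{41}, \lambda)$   $C(2/3, \lambda)^2$   $D(2/3, \lambda)$
0               40943                         $\infty$                    36613                 $\infty$
1/9             51818                         270695                      46339                 242072
1/2             163769                        13368                       146452                11955
$1/\sqrt{e}$    264453                        9085                        236489                8124
1               $\infty$                      3342                        $\infty$              2989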


Remark 5.6. Note however that when squaring for example (13) in order to arrive
at an expression such as (1), $C^2$ needs to be multiplied with something like
$1 + \beta$ (using for example Young's inequality), but $\beta > 0$ can be
chosen very small.

Asymptotically, in the sense of remark 5.3, we could gain about a factor $e$.
So optimal lower bounds using our methods are given by:

$$17747 < C\left(\tfrac{4}{\sqrt{41}}, \lambda\right)^2, \quad 15985 < C\left(\tfrac{2}{3}, \lambda\right)^2, \quad 1449 < D\left(\tfrac{4}{\sqrt{41}}, \lambda\right), \quad 1305 < D\left(\tfrac{2}{3}, \lambda\right).$$

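Proof of lemma 5.1. First note that

$$E\left[ \frac{1}{m} A^* A \right] = \frac{1}{m} \sum_{j=1}^m E\left[ X_j^* X_j \right] = I_N.$$

We will do the proof for the low entropy isometry constants, then the same
conclusion will hold for the restricted isometry constants since they are always
smaller. Let $U = \{u \in \mathbb{R}^N;\ \|u\|_1 \le \sqrt{s},\ \|u\|_2 \le 1\}$.
By the symmetrization inequality (prop. 4.2), Fatou's lemma and the definition
of $\delta_s$ (as in proposition 3.7; a similar definition holds for the low
entropy isometry constants when we take the supremum over the larger set $U$),
we get

$$E\,\delta_s^{2n} = E \sup_{u \in U} \Big| \frac{1}{m} \sum_{j=1}^m \left( |\langle X_j, u \rangle|^2 - \|u\|_2^2 \right) \Big|^{2n} \le 2^{2n}\, E\, E_\epsilon \sup_{u \in U} \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j |\langle X_j, u \rangle|^2 \Big|^{2n}$$

where $\epsilon = \{\epsilon_j\}_{j=1}^m$ is a Rademacher sequence. Let us now
fix a realization of the $X_j =: x_j$ and define

$$E_{2n} = E_{2n}(x) := \left( E_\epsilon \sup_{u \in U} \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j |\langle x_j, u \rangle|^2 \Big|^{2n} \right)^{1/2n}, \quad \text{so } E\,\delta_s^{2n} \le E\left[ (2 E_{2n}(X))^{2n} \right].$$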

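By lemma 4.4, for every $u \in U$ there exists a gridpoint
$z_k \in G_k := (2^{-2k}\sqrt{s})\mathbb{Z}^N \cap B_1(0, \sqrt{s})$ (where
$B_1(0, \sqrt{s}) = \{z \in \mathbb{R}^N : \|z\|_1 \le \sqrt{s}\}$, and since
$U \subset \{u \in \mathbb{R}^N;\ \|u\|_1 \le \sqrt{s},\ \|u\|_2 \le 1\}$) such
that for any $p \ge 1$,

$$d_{x,2p}(u, z_k) \le r_k(p).$$

For every $z_k \in G_k$ consider

$$B_{x,2p}(z_k, r_k) = \{z \in \mathbb{R}^N : d_{x,2p}(z, z_k) \le r_k(p)\}.$$

If $U \cap B_{x,2p}(z_k, r_k) \ne \emptyset$, pick an arbitrary element from this
set and denote it $\pi_k u$, then we get a finite cover of $U$ with balls
$B_{x,2p}(\pi_k u, 2r_k)$. We will do this for $l \le k \le L$ where $l$ and $L$
are to be determined. Denote by $U_k := \{\pi_k u : u \in U_{k+1}\}$ and note
that $|U_k| \le |G_k| \le N_k < \infty$. Now we get, using telescoping sums and
the conventions $U_{L+1} = U$, $\Pi_{L+1} u = u$,

$$\sum_{j=1}^m \epsilon_j |\langle x_j, u \rangle|^2 = \sum_{j=1}^m \sum_{k=l+1}^{L+1} \epsilon_j \left( |\langle x_j, \Pi_k u \rangle|^2 - |\langle x_j, \Pi_{k-1} u \rangle|^2 \right) + \sum_{j=1}^m \epsilon_j |\langle x_j, \Pi_l u \rangle|^2,$$

so that

$$\Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j |\langle x_j, u \rangle|^2 \Big| \le \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j \left( |\langle x_j, u \rangle|^2 - |\langle x_j, \Pi_L u \rangle|^2 \right) \Big| + \sum_{k=l+1}^{L} \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j \left( |\langle x_j, \Pi_k u \rangle|^2 - |\langle x_j, \Pi_{k-1} u \rangle|^2 \right) \Big| + \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j |\langle x_j, \Pi_l u \rangle|^2 \Big|,$$

where $\Pi_k u = \pi_k \circ \pi_{k+1} \circ \dots \circ \pi_L u$. Then we get

$$E_{2n} = \left( E_\epsilon \sup_{u \in U} \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j |\langle x_j, u \rangle|^2 \Big|^{2n} \right)^{1/2n} \le \left( E_\epsilon \sup_{u \in U} \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j \left( |\langle x_j, u \rangle|^2 - |\langle x_j, \pi_L u \rangle|^2 \right) \Big|^{2n} \right)^{1/2n}$$

$$+ \sum_{k=l+1}^{L} \left( E_\epsilon \sup_{u \in U_k} \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j \left( |\langle x_j, u \rangle|^2 - |\langle x_j, \pi_{k-1} u \rangle|^2 \right) \Big|^{2n} \right)^{1/2n} + \left( E_\epsilon \sup_{u \in U_l} \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j |\langle x_j, u \rangle|^2 \Big|^{2n} \right)^{1/2n}$$

$$=: S_{L+1} + S_{l+1,L} + S_l.$$

In order to estimate $S_{l+1,L}$ we introduce

$$g_k(\epsilon, u) := \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j \left( |\langle x_j, u \rangle|^2 - |\langle x_j, \pi_{k-1} u \rangle|^2 \right) \Big|, \quad u \in U_k, \quad \text{and} \quad f_k(\epsilon) := \sup_{u \in U_k} g_k(\epsilon, u).$$

We also specify norm notations; using
$\|f\|_{\epsilon,2n} := (E_\epsilon |f|^{2n})^{1/2n}$, we can write

$$S_{l+1,L} = \sum_{k=l+1}^{L} \|f_k\|_{\epsilon,2n}.$$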



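We will derive auxiliary estimates for $S_l$, $\|f_k\|_{\epsilon,2n}$ and
$S_{L+1}$ summarized in

Lemma 5.7. For any non-negative integers $l \le k \le L$, there are
$p \ge q > 1$ (depending on $K$ and $s$), $\frac{1}{p} + \frac{1}{q} = 1$, such
that for any positive integer $n$ the following estimates hold:

$$S_l \le \left( \frac{2K^2 s n}{m} \right)^{1/2} (2^{3/4} N_l)^{1/2n}\, S^{1/q} \tag{14}$$

$$\|f_k\|_{\epsilon,2n} \le \left( \frac{2^{10-2k} K^2 s p n}{me} \right)^{1/2} \left( (2^{3/4} N_k)^{1/n} \right)^{1/2} S^{1/q} \tag{15}$$

$$S_{L+1} \le \left( 2^{7-2L} K^2 s p \right)^{1/2} S^{1/q} \tag{16}$$

where $N_k \ge |U_k|$ and

$$S = S(x) = \sup_{u \in U} \left( \frac{1}{m} \sum_{j=1}^m |\langle x_j, u \rangle|^2 \right)^{1/2}.$$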



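Proof of lemma 5.7. There are many similarities in proving the above estimates.
If we first consider $S_l^{2n}$ for a fixed $\Pi_l u$ it follows by Khintchine's
and Hölder's inequalities that

$$E_\epsilon \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j |\langle x_j, \Pi_l u \rangle|^2 \Big|^{2n} \le 2^{3/4} \left( \frac{2n}{me} \right)^n \left( \frac{1}{m} \sum_{j=1}^m |\langle x_j, \Pi_l u \rangle|^4 \right)^n$$

$$\le 2^{3/4} \left( \frac{2n}{me} \right)^n \left( \frac{1}{m} \sum_{j=1}^m |\langle x_j, \Pi_l u \rangle|^{2p} \right)^{n/p} \left( \frac{1}{m} \sum_{j=1}^m |\langle x_j, \Pi_l u \rangle|^{2q} \right)^{n/q}$$

$$\le 2^{3/4} \left( \frac{2n}{me} \right)^n (K^2 s)^n (K^2 s)^{n/p} \left( \sup_{u \in U} \frac{1}{m} \sum_{j=1}^m |\langle x_j, u \rangle|^2 \right)^{n/q} = 2^{3/4} \left( \frac{2K^2 s n}{me} \right)^n (K^2 s)^{n/p} S^{2n/q}.$$

In the last two lines we simply used that $\|x_j\|_\infty \le K$ and
$\|\Pi_l u\|_1 \le \sqrt{s}$ and thus $|\langle x_j, \Pi_l u \rangle|^2 \le K^2 s$.
Since the derived estimate holds for any $\Pi_l u \in U_l$ we can use the trivial
inequality

$$E_\epsilon \sup_{u \in U_k} |f(\epsilon, u)| \le E_\epsilon \sum_{u \in U_k} |f(\epsilon, u)| \le N_k A,$$

which holds whenever $E_\epsilon |f(\epsilon, u)| \le A$ and $|U_k| \le N_k$, to
get

$$S_l^{2n} \le N_l \cdot 2^{3/4} \left( \frac{2K^2 s n}{me} \right)^n (K^2 s)^{n/p} S^{2n/q}.$$

In the proof of (15), we will choose $p$ large enough to ensure
$(K^2 s)^{1/p} \le e$. Taking this into account, combined with taking the
$2n$:th root of the above inequality, shows (14):

$$S_l \le \left( \frac{2K^2 s n}{m} \right)^{1/2} (2^{3/4} N_l)^{1/2n} S^{1/q}.$$

In the same manner one shows for fixed $u \in U_k$

$$E_\epsilon\, g_k(\epsilon, u)^{2n} \le 2^{3/4} \left( \frac{2n}{me} \right)^n \left( \frac{1}{m} \sum_{j=1}^m \left( |\langle x_j, u \rangle| - |\langle x_j, \pi_{k-1} u \rangle| \right)^{2p} \right)^{n/p} \left( \frac{1}{m} \sum_{j=1}^m \left( |\langle x_j, u \rangle| + |\langle x_j, \pi_{k-1} u \rangle| \right)^{2q} \right)^{n/q}$$

$$\le 2^{3/4} \left( \frac{2n}{me} \right)^n (2 r_{k-1}(p))^{2n} \left( \sup_{u \in U} \frac{1}{m} \sum_{j=1}^m (2 |\langle x_j, u \rangle|)^{2q} \right)^{n/q} \le 2^{3/4} \left( \frac{2n}{me} \right)^n (2 r_{k-1}(p))^{2n}\, 4^n (K^2 s)^{n/p} S^{2n/q}$$

$$= 2^{3/4} \left( \frac{2^{10-2k} K^2 s p n}{me^2} \right)^n (2^{3/4} K^2 s)^{n/p} S^{2n/q},$$

where we plugged in
$r_{k-1}(p) = 2^{1-k} K \left( 2^{3/(4p)}\, 8sp/e \right)^{1/2}$ from the remark
following lemma 4.4. Since the above is valid for all $u \in U_k$, we get
(similarly as for $S_l$)

$$\|f_k\|_{\epsilon,2n} \le \left( \frac{2^{10-2k} K^2 s p n}{me^2} \right)^{1/2} (2^{3/4} K^2 s)^{1/2p} (2^{3/4} N_k)^{1/2n} S^{1/q},$$

where the $N_k$ are also chosen as in the remark following lemma 4.4. Choosing
$p = \ln(2^{3/4} K^2 s)$ ensures that $(2^{3/4} K^2 s)^{1/2p} = e^{1/2}$, which
concludes the proof of (15).

Lastly, fixing $u \in U$, using Cauchy-Schwarz and Hölder's inequalities,

$$E_\epsilon \Big| \frac{1}{m} \sum_{j=1}^m \epsilon_j \left( |\langle x_j, u \rangle|^2 - |\langle x_j, \pi_L u \rangle|^2 \right) \Big|^{2n} \le \left( \frac{1}{m} \sum_{j=1}^m \left( |\langle x_j, u \rangle|^2 - |\langle x_j, \pi_L u \rangle|^2 \right)^2 \right)^n$$

$$\le \left( \frac{1}{m} \sum_{j=1}^m \left( |\langle x_j, u \rangle| - |\langle x_j, \pi_L u \rangle| \right)^{2p} \right)^{n/p} \left( \frac{1}{m} \sum_{j=1}^m \left( |\langle x_j, u \rangle| + |\langle x_j, \pi_L u \rangle| \right)^{2q} \right)^{n/q}$$

$$\le (4 r_L(p))^{2n} (K^2 s)^{n/p} S^{2n/q} = \left( \frac{2^{7-2L} K^2 s p}{e} \right)^n (2^{3/4} K^2 s)^{n/p} S^{2n/q} = (2^{7-2L} K^2 s p)^n S^{2n/q}.$$

Since this holds for any $u \in U$, (16) follows by taking a $2n$:th root. □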



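Comparing the bounds in (14) and (15) for $k = l$, one easily sees that
choosing

$$l := \left\lfloor \frac{1}{2} \log_2\left( \frac{2^9 p}{e} \right) \right\rfloor \le \frac{1}{2} \log_2\left( \frac{2^9 p}{e} \right)$$

implies that

$$S_l \le \left( \frac{2^{10-2l} K^2 s n p}{me} \right)^{1/2} \left( (2^{3/4} N_l)^{1/n} \right)^{1/2} S^{1/q}.$$

Next we will define an increasing sequence $\{n_k\}_{k=l}^L$ by

$$n_k = \max\left\{ \ln(2^{3/4} N_k),\ \ln\frac{1}{\epsilon} \right\},$$

implying that $(2^{3/4} N_k)^{1/n_k} \le e$. Choosing $n = n_l$,
$p = \ln(2^{3/4} K^2 s)$ in lemma 5.7 and using that
$\|\cdot\|_{\epsilon,2n_l} \le \|\cdot\|_{\epsilon,2n_k}$, $k \ge l$, we get after
this step the estimates

$$S_l \le \left( \frac{2^{10-2l} K^2 s p\, n_l}{m} \right)^{1/2} S^{1/q},$$

$$\|f_k\|_{\epsilon,2n} \le \|f_k\|_{\epsilon,2n_k} \le \left( \frac{2^{10-2k} K^2 s p\, n_k}{m} \right)^{1/2} S^{1/q}, \quad l < k \le L.$$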
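Then by the triangle inequality we have shown

$$S_l + S_{l+1,L} \le \sum_{k=l}^{L} \left( \frac{2^{10-2k} K^2 s p\, n_k}{m} \right)^{1/2} S^{1/q} = \left( \frac{2^{10} K^2 s p}{m} \right)^{1/2} \sum_{k=l}^{L} \sqrt{2^{-2k} n_k}\; S^{1/q}.$$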



Introducing the covering numbers Nf. from the remark after lemma |4.4| and 
observing that / > \ log 2 (2 5 p) , we have that if N > Ap (true by assumption) 



2 3 / 4 iV t , = 2 3 / 4 



/2iVe 



' 2 3/(2 7 p) e iV ( 1_ £_ 
16 + TV 



< 



This implies that 

L L 



\/2- 2fe n fe 



fc=z 



^y2- 2fc max{ln(v / 2iV A: ),ln(l/e)} < 

L 

^ max{ln 1/2 (A7;p), 2~ fc ln 1/2 (l/e)} < 

(L-l + l)ln^(N/ P ) + 1 ^^ (17) 

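To get a bound on $L$ we use the bound of $S_{L+1}$ given by lemma 5.7. The
right hand side of (16), and hence also $S_{L+1}$, is less than or equal to
$\frac{\lambda g}{2} S^{1/q}$ if and only if

$$L \ge \frac{1}{2} \log_2\left( \frac{2^9 K^2 s p}{(\lambda g)^2} \right),$$

so we choose

$$L = \left\lceil \frac{1}{2} \log_2\left( \frac{2^9 K^2 s p}{(\lambda g)^2} \right) \right\rceil.$$

By the above estimates on $l$ and $L$ we get

$$L - l + 1 \le \frac{1}{2} \log_2\left( \frac{2^9 K^2 s p}{(\lambda g)^2} \right) - \frac{1}{2} \log_2\left( \frac{2^9 p}{e} \right) + 3 = \frac{1}{2 \ln 2} \ln\left( \frac{2^6 e K^2 s}{(\lambda g)^2} \right).$$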

Plugging this into (17), we have shown 

L 



2 In 2 



In 



2 b e^s 



1/2 



\n 1 / 2 (N/p)+ [ -M ln^^l/e) 
Thus



Set now 

1/2 



2 8 K 2 s \ 1 ( 1/2l {2 6 eK 2 s\ , 1/2/ . . , 1/2/ . „.\ cln2 
(lA^F J ( /p) + (1/£ } J ' ° = 

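so that what we have shown can be expressed by

$$E_{2n} \le S_l + S_{l+1,L} + S_{L+1} \le \frac{H + \lambda g}{2}\, S^{1/q}.$$

Plugging the stochastic rows $X_j$ back in $S = S(X)$ we have shown

$$E\,\delta_s^{2n} \le E\left[ (2 E_{2n}(X))^{2n} \right] \le (H + \lambda g)^{2n}\, E\, S^{2n/q} = (H + \lambda g)^{2n}\, E\left[ \sup_{u \in U} \left( \frac{1}{m} \sum_{j=1}^m |\langle X_j, u \rangle|^2 \right)^{n/q} \right]$$

$$= (H + \lambda g)^{2n}\, E\left[ \sup_{u \in U} \left( \frac{1}{m} \sum_{j=1}^m |\langle X_j, u \rangle|^2 - \|u\|_2^2 + \|u\|_2^2 \right)^{n/q} \right] \le (H + \lambda g)^{2n}\, E\left[ (\delta_s + 1)^{n/q} \right].$$

This finally implies that

$$E\left[ \delta_s^{2n} \right]^{1/n} \le (H + \lambda g)^2 \left( E\left[ \delta_s^{n/q} \right]^{q/n} + 1 \right)^{1/q} \le (H + \lambda g)^2 \left( E\left[ \delta_s^{2n} \right]^{1/2n} + 1 \right)^{1/q}, \tag{18}$$

which concludes the proof of lemma 5.1. □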

6 Appendix 

6.1 Proof of theorem 3.8

The proof of this theorem requires some simple lemmas. 

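Lemma 6.1. Let $A$ be an $m \times N$-matrix satisfying the RIP-estimate with
constants $\delta_s$ and $x, y \in \mathbb{C}^N$ be vectors such that
$|\operatorname{supp} x \cup \operatorname{supp} y| \le 2s$ and
$\langle x, y \rangle = 0$. Let $|t| \le 1$ be such that

$$\|Ax\|_2^2 - \|x\|_2^2 = t \delta_{2s} \|x\|_2^2,$$

then

$$|\langle Ax, Ay \rangle| \le \delta_{2s} \sqrt{1 - t^2}\, \|x\|_2 \|y\|_2.$$

Proof. We can assume $\|x\|_2 = \|y\|_2 = 1$. Pick $\alpha > 0$, $\beta > 0$,
$\gamma = \pm 1$ and consider the vectors $\alpha x + \gamma y$ and
$\beta x - \gamma y$, then

$$\|\alpha x + \gamma y\|_2^2 = \alpha^2 + 1 \tag{19}$$
$$\|\beta x - \gamma y\|_2^2 = \beta^2 + 1 \tag{20}$$
$$\|A(\alpha x + \gamma y)\|_2^2 = \alpha^2 \|Ax\|_2^2 + \|Ay\|_2^2 + 2\alpha\gamma \langle Ax, Ay \rangle \tag{21}$$
$$\|A(\beta x - \gamma y)\|_2^2 = \beta^2 \|Ax\|_2^2 + \|Ay\|_2^2 - 2\beta\gamma \langle Ax, Ay \rangle. \tag{22}$$

Furthermore, since $A$ satisfies the restricted isometry property,

$$\|A(\alpha x + \gamma y)\|_2^2 - \|\alpha x + \gamma y\|_2^2 \le \delta_{2s} \|\alpha x + \gamma y\|_2^2 \tag{23}$$
$$\|A(\beta x - \gamma y)\|_2^2 - \|\beta x - \gamma y\|_2^2 \ge -\delta_{2s} \|\beta x - \gamma y\|_2^2. \tag{24}$$

Subtracting (24) from (23) and plugging in (19)-(22) we get

$$(\alpha^2 - \beta^2) \|Ax\|_2^2 + 2\gamma(\alpha + \beta) \langle Ax, Ay \rangle - \alpha^2 + \beta^2 \le \delta_{2s} (\alpha^2 + \beta^2 + 2) \iff$$
$$2\gamma(\alpha + \beta) \langle Ax, Ay \rangle \le (\beta^2 - \alpha^2)(\|Ax\|_2^2 - \|x\|_2^2) + \delta_{2s} (\alpha^2 + \beta^2 + 2) \iff$$
$$\gamma \langle Ax, Ay \rangle \le \delta_{2s}\, \frac{\alpha^2 (1 - t) + \beta^2 (1 + t) + 2}{2(\alpha + \beta)}.$$

Since this holds for $\gamma = \pm 1$, if we set
$f(\alpha, \beta) = \frac{\alpha^2 (1 - t) + \beta^2 (1 + t) + 2}{2(\alpha + \beta)}$
we have shown

$$|\langle Ax, Ay \rangle| \le \delta_{2s} f(\alpha, \beta).$$

Finally we find the minimum value of $f$ in the first quadrant to be
$\sqrt{1 - t^2}$ at the critical point
$(\alpha, \beta) = \left( \sqrt{\frac{1+t}{1-t}}, \sqrt{\frac{1-t}{1+t}} \right)$.
Hence

$$|\langle Ax, Ay \rangle| \le \delta_{2s} \sqrt{1 - t^2}\, \|x\|_2 \|y\|_2.$$

□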
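
The minimization at the end of the proof is easy to check numerically. The sketch below assumes that the function being minimized is $f(\alpha, \beta) = (\alpha^2(1-t) + \beta^2(1+t) + 2)/(2(\alpha+\beta))$, evaluates it at the stated critical point and compares with a coarse grid search over the first quadrant. The grid parameters and function names are arbitrary choices made for this illustration.

```python
import math


def f(alpha, beta, t):
    """Assumed objective from the proof of Lemma 6.1."""
    return (alpha ** 2 * (1 - t) + beta ** 2 * (1 + t) + 2) / (2 * (alpha + beta))


def critical_point(t):
    """The point (sqrt((1+t)/(1-t)), sqrt((1-t)/(1+t))) for |t| < 1."""
    return math.sqrt((1 + t) / (1 - t)), math.sqrt((1 - t) / (1 + t))


def grid_min(t, steps=200, hi=5.0):
    """Smallest value of f on a uniform grid in (0, hi] x (0, hi]."""
    best = float("inf")
    for i in range(1, steps + 1):
        for j in range(1, steps + 1):
            best = min(best, f(hi * i / steps, hi * j / steps, t))
    return best


t = 0.3
a, b = critical_point(t)
print(f(a, b, t), math.sqrt(1 - t ** 2), grid_min(t))
```

At the critical point one has $\alpha^2(1-t) = 1+t$ and $\beta^2(1+t) = 1-t$, so the numerator equals 4 and the denominator equals $4/\sqrt{1-t^2}$, which is the value the grid search approaches from above.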

The following result can be found in [1] (proposition 2.1).

Lemma 6.2. Suppose $x = (x_1, x_2, \dots, x_s)$,
$x_1 \ge x_2 \ge \dots \ge x_s \ge 0$, then

$$\|x\|_2 \le \frac{1}{\sqrt{s}} \|x\|_1 + \frac{\sqrt{s}}{4} (x_1 - x_s).$$
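
A quick numerical check of the norm inequality, assuming it has the form $\|x\|_2 \le \|x\|_1/\sqrt{s} + \frac{\sqrt{s}}{4}(x_1 - x_s)$ for nonnegative, nonincreasing $x$, can be sketched as follows; `norm_gap` is a hypothetical helper returning the right-hand side minus the left-hand side.

```python
import math
import random


def norm_gap(x):
    """RHS minus LHS of ||x||_2 <= ||x||_1 / sqrt(s) + sqrt(s)/4 * (x_1 - x_s).

    x is expected to be nonnegative and sorted in nonincreasing order.
    """
    s = len(x)
    l1 = sum(x)
    l2 = math.sqrt(sum(v * v for v in x))
    return l1 / math.sqrt(s) + math.sqrt(s) / 4 * (x[0] - x[-1]) - l2


rng = random.Random(0)
gaps = []
for _ in range(1000):
    s = rng.randint(1, 8)
    x = sorted((rng.random() for _ in range(s)), reverse=True)
    gaps.append(norm_gap(x))
print(min(gaps))
```

For a constant vector both sides coincide, which is why the correction term $\frac{\sqrt{s}}{4}(x_1 - x_s)$ is needed to measure how far the entries are from being equal.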



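Lemma 6.3. Assume $x = (x_1, \dots, x_N) \in \mathbb{C}^N$ is such that
$|x_1| \ge |x_2| \ge \dots \ge |x_N|$. Write $x = \sum_k x_{S_k}$ where
$S_1 = \{1, \dots, t\}$, $S_k = \{t + (k-2)s + 1, \dots, t + (k-1)s\}$, $k > 1$,
so that $|S_1| = t$, $|S_k| = s$, $k > 1$ (except for possibly the last $k$),
then

$$\sum_{k>1} \|x_{S_k}\|_2 \le \frac{1}{\sqrt{s}} \|x_{S_1^c}\|_1 + \frac{\sqrt{s}}{4} |x_{t+1}|.$$

Proof. For $k > 1$,

$$\|x_{S_k}\|_2 \le \frac{1}{\sqrt{s}} \|x_{S_k}\|_1 + \frac{\sqrt{s}}{4} \left( |x_{t+(k-2)s+1}| - |x_{t+(k-1)s}| \right) \le \frac{1}{\sqrt{s}} \|x_{S_k}\|_1 + \frac{\sqrt{s}}{4} \left( |x_{t+(k-2)s+1}| - |x_{t+(k-1)s+1}| \right)$$

by lemma 6.2. Summing this over $k > 1$ gives (since
$S_k \cap S_l = \emptyset$, $k \ne l$)

$$\sum_{k>1} \|x_{S_k}\|_2 \le \sum_{k>1} \left( \frac{1}{\sqrt{s}} \|x_{S_k}\|_1 + \frac{\sqrt{s}}{4} \left( |x_{t+(k-2)s+1}| - |x_{t+(k-1)s+1}| \right) \right) \le \frac{1}{\sqrt{s}} \|x_{S_1^c}\|_1 + \frac{\sqrt{s}}{4} |x_{t+1}|.$$

□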

Proof of proposition 3.5. The proposition follows by lemma 6.3 with $t = s$,
since then we can estimate the last term in the inequality with

$$|x_{s+1}| \le \frac{1}{s} \|x_{S_1}\|_1.$$

Using this one gets

$$\frac{\|x_{S_1}\|_1}{\sqrt{s}} \le \|x_{S_1}\|_2 < \frac{4}{5} \sum_{k>1} \|x_{S_k}\|_2 \le \frac{4}{5\sqrt{s}} \left( \|x_{S_1^c}\|_1 + \frac{1}{4} \|x_{S_1}\|_1 \right) \implies \|x_{S_1}\|_1 < \|x_{S_1^c}\|_1.$$

It is now clear that the same holds for any subset $S \subset [N]$ with
$|S| = s$.

□



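Proof of theorem 3.8. Take $A$ and $t$ as in lemma 6.1 (with $-t$ in place of
$t$) and $x = \sum_k x_{S_k}$ as in lemma 6.3 (with $|S_k| = s$,
$k = 1, 2, \dots$, except for possibly the last $k$) such that $Ax = 0$. Then
we get, since $\|Ax_{S_1}\|_2^2 \ge (1 - t\delta_{2s}) \|x_{S_1}\|_2^2$, that

$$(1 - t\delta_{2s}) \|x_{S_1}\|_2^2 \le \|Ax_{S_1}\|_2^2 = \Big\langle Ax_{S_1}, -\sum_{k>1} Ax_{S_k} \Big\rangle \le \sum_{k>1} \langle Ax_{S_1}, A(-x_{S_k}) \rangle \le \delta_{2s} \sqrt{1 - t^2}\, \|x_{S_1}\|_2 \sum_{k>1} \|x_{S_k}\|_2.$$

Now we use lemma 6.3 and the inequality
$\|x_{S_1}\|_1 \le \sqrt{s} \|x_{S_1}\|_2$:

$$\|x_{S_1}\|_1 \le \frac{\delta_{2s} \sqrt{1 - t^2}}{1 - t\delta_{2s}} \sqrt{s} \sum_{k>1} \|x_{S_k}\|_2 \le \frac{\delta_{2s} \sqrt{1 - t^2}}{1 - t\delta_{2s}} \left( \|x_{S_1^c}\|_1 + \frac{1}{4} \|x_{S_1}\|_1 \right).$$

It follows that $\|x_{S_1}\|_1 < \|x_{S_1^c}\|_1$ (i.e. the null space property
is fulfilled) if

$$1 - \frac{\delta_{2s} \sqrt{1 - t^2}}{4(1 - t\delta_{2s})} > \frac{\delta_{2s} \sqrt{1 - t^2}}{1 - t\delta_{2s}} \iff \frac{4}{5} > \frac{\delta_{2s} \sqrt{1 - t^2}}{1 - t\delta_{2s}}.$$

Now observe that the maximum of the right hand side is attained at
$t = \delta_{2s}$ and hence we want

$$\frac{4}{5} > \frac{\delta_{2s}}{\sqrt{1 - \delta_{2s}^2}},$$

which is fulfilled as long as $\delta_{2s} < \frac{4}{\sqrt{41}}$. □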



6.2 Improving on theorem 3.8 



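Here we will sketch out the details for our, so far, best improvement of
theorem 3.8. The first step is to replace $t$ and $s$ in lemma 6.3 with
$\lceil 6s/5 \rceil$ and $\lfloor 4s/5 \rfloor$ respectively, where $s \ge 2$ is
an integer. We also introduce $S = \{1, 2, \dots, s\} \subset S_1$. Then we have

$$\sum_{k>1} \|x_{S_k}\|_2 \le \frac{1}{\sqrt{\lfloor 4s/5 \rfloor}} \left( \|x_{S^c}\|_1 - \Big( \|x_{S_1 \setminus S}\|_1 - \frac{\lfloor 4s/5 \rfloor}{4} |x_{\lceil 6s/5 \rceil + 1}| \Big) \right) \le \frac{1}{\sqrt{4s/5 - 1}} \|x_{S^c}\|_1. \tag{25}$$

The last inequality follows since

$$\|x_{S_1 \setminus S}\|_1 - \frac{\lfloor 4s/5 \rfloor}{4} |x_{\lceil 6s/5 \rceil + 1}| \ge (\lceil 6s/5 \rceil - s) |x_{\lceil 6s/5 \rceil + 1}| - \frac{\lfloor 4s/5 \rfloor}{4} |x_{\lceil 6s/5 \rceil + 1}| \ge 0.$$

Observe that if 5 divides $s$, we may replace $\frac{1}{\sqrt{4s/5 - 1}}$ with
$\frac{1}{\sqrt{4s/5}}$. The improvement of proposition 3.5 becomes: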

Proposition 6.4. Assume $x = (x_1, \dots, x_N) \in \mathbb{C}^N$ is such that
$|x_1| \ge |x_2| \ge \dots \ge |x_N|$ and that $s \ge 2$ is an integer. Write
$x = \sum_k x_{S_k}$ where $S_1 = \{1, \dots, \lceil 6s/5 \rceil\}$,
$S_2 = \{\lceil 6s/5 \rceil + 1, \dots, \lceil 6s/5 \rceil + \lfloor 4s/5 \rfloor\}$
etc., so that $|S_1| = \lceil 6s/5 \rceil$, $|S_k| = \lfloor 4s/5 \rfloor$,
$k \ge 2$ (except for possibly the last $k$). Then if

$$\|x_{S_1}\|_2 < \frac{\sqrt{4s/5 - 1}}{\sqrt{s}} \sum_{k>1} \|x_{S_k}\|_2,$$

it follows that $\|x_S\|_1 < \|x_{S^c}\|_1$ for all subsets $S \subset [N]$ with
$|S| = s$. In particular if 5 divides $s$,

$$\|x_{S_1}\|_2 < \sqrt{\frac{4}{5}} \sum_{k>1} \|x_{S_k}\|_2$$

is sufficient.

Proof. If $S = \{1, \dots, s\} \subset S_1$ then by (25)

$$\frac{\|x_S\|_1}{\sqrt{s}} \le \|x_S\|_2 \le \|x_{S_1}\|_2 < \frac{\sqrt{4s/5 - 1}}{\sqrt{s}} \sum_{k>1} \|x_{S_k}\|_2 \le \frac{1}{\sqrt{s}} \|x_{S^c}\|_1.$$

□



Now we can simply modify the proof of theorem 3.8 in the previous section
in a rather obvious way to find that

$$\delta_{2s} < \begin{cases} \sqrt{\dfrac{4s - 5}{9s - 5}}, & 2 \le s,\ 5 \text{ does not divide } s, \\[2mm] \dfrac{2}{3}, & 2 \le s,\ 5 \text{ divides } s, \end{cases}$$

implies that the matrix $A$ with restricted isometry constants $\delta_s$
satisfies the null space property of order $s$. The combination of the result of
theorem 3.8 (which is better for small $s$) with the improved one above can be
summarized in the following figure.



[Figure: bound on $\delta_{2s}$ (vertical axis, 0.62 to 0.68) plotted against $s$ (horizontal axis, 20 to 200).]

Figure 1: Plot of optimal bounds of the constants $\delta_{2s}$ for
$s = 1, \dots, 200$, implying NSP. For the smallest $s$, $4/\sqrt{41}$ is best,
while if 5 divides $s$, $2/3$ will do. For larger $s$ that is not divisible by 5
an upper bound is given by $\sqrt{\frac{4s - 5}{9s - 5}}$.
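
The values behind Figure 1 can be tabulated with a few lines of code. The sketch below is an illustration under an explicit assumption: for $s \ge 2$ not divisible by 5 it uses the bound $\sqrt{(4s-5)/(9s-5)}$, which is what the constant $\sqrt{4s/5 - 1}/\sqrt{s}$ of Proposition 6.4 gives when it replaces $4/5$ in the last step of the proof of theorem 3.8. The function name `nsp_threshold` is hypothetical.

```python
import math


def nsp_threshold(s):
    """Best available bound on delta_2s implying NSP of order s (assumed formulas)."""
    base = 4 / math.sqrt(41)          # theorem 3.8, valid for every s
    if s < 2:
        return base                   # the improvement is only stated for s >= 2
    if s % 5 == 0:
        improved = 2 / 3
    else:
        improved = math.sqrt((4 * s - 5) / (9 * s - 5))
    return max(base, improved)


for s in (1, 5, 6, 7, 50, 199, 200):
    print(s, round(nsp_threshold(s), 4))
```

Under this assumption the improved bound overtakes $4/\sqrt{41}$ from $s = 7$ onwards for $s$ not divisible by 5, and it increases towards $2/3$ as $s$ grows, in line with the range of the vertical axis in the figure.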